Skip to content

Conversation

@obenland
Copy link
Member

@obenland obenland commented Oct 13, 2025

Proposed changes:

Implements a dual storage strategy where Mastodon imports continue using WordPress attachment posts while
ActivityPub inbox items use direct file storage in uploads/activitypub/ap_posts/{post_id}/.

New methods added:

  • import_files() - Public method for direct file import (used by ap_posts)
  • save_file() - Downloads and saves files to uploads/activitypub/ap_posts/{post_id}/, returns array with
    url, mime_type, and alt keys
  • import_inline_files() - Processes inline images for file storage
  • append_files_to_content() - Appends file-based media to content
  • generate_files_markup() - Creates block markup from file data
  • get_files_gallery_block() - Gallery block for file-based attachments
  • delete_directory() - Public helper to delete post's file directory

Methods removed:

  • maybe_hide_from_media_library() - No longer needed, Mastodon import attachments should be visible in the
    user's media library
  • has_updated_attachments() - Simplified to inline empty check

Key improvements:

  • Uses WP_Filesystem API exclusively for all file operations
  • Only handles remote URLs (no local file logic needed for inbox items)
  • Public $ap_posts_dir property for consistent path references
  • Removed unused function parameters
  • Well-documented return types with hash notation for array keys
  • Simple folder-per-post structure for easy cleanup

Other information:

  • Have you written new tests for your changes, if applicable?

Testing instructions:

  • All 1212 tests passing
  • Tests updated to verify file creation instead of attachment posts
  • Test cleanup properly removes activitypub directories
  • Mastodon import tests still pass (no regression)
  • Tests use public $ap_posts_dir property for path consistency

Technical details:
The implementation uses a simple folder-per-post structure where each ap_post gets its own directory at
uploads/activitypub/ap_posts/{post_id}/. When a post is deleted, the entire directory is recursively removed
using WP_Filesystem::delete().

Files are tracked by the directory structure itself rather than post meta, keeping the implementation simple
and avoiding the need for complex deduplication logic.

Changelog entry

  • Automatically create a changelog entry from the details below.
Changelog Entry Details

Significance

  • Minor

Type

  • Changed - for changes in existing functionality

Message

Use direct file storage for ActivityPub inbox attachments instead of creating WordPress attachment posts.
Mastodon imports continue using attachment posts as before.

pfefferle and others added 30 commits October 13, 2025 10:45
Introduces a new 'ap_object' post type and related taxonomies for handling ActivityPub objects, including registration, CRUD operations, and taxonomy management. Updates create and update handlers to process non-interaction objects using the new Objects collection. Enhances debug functionality to support the new post type and taxonomies.
Introduced Sanitize::content() to process and format content for ActivityPub, including block support and HTML sanitization. Updated Objects class to use the new sanitizer for post content.
Introduces a new Blocks::html_to_blocks() method to convert HTML content into block format using DOMDocument parsing and block mapping. Refactors Sanitize to use this new method, replacing the previous regex-based paragraph splitting logic for improved accuracy and maintainability.
The get_node_attributes private method was removed and the logic for setting the 'ordered' attribute on 'ol' nodes is now handled inline. This simplifies the code by reducing indirection and focusing only on the required attribute for ordered lists.
Refactored the Blocks class by renaming the html_to_blocks method to convert_from_html for improved clarity. Updated all references to use the new method name.
Introduces new PHPUnit tests for content sanitization in Sanitize::content, covering scenarios such as block support, malicious content, URLs, empty content, and safe HTML preservation. Also adds a test for Blocks::convert_from_html to verify HTML-to-block conversion when blocks are supported.
Introduces tests to verify content sanitization in Create handler, handling of private activities, and behavior with malformed object data. Also ensures required post types are registered during test setup.
Introduces a comprehensive PHPUnit test suite for the Activitypub\Collection\Objects class, covering object addition, updating, retrieval by GUID, and activity-to-post conversion, including edge cases and error handling. Mocks HTTP requests for remote actor fetching and ensures correct post type registration for test isolation.
Renamed update_actor and update_object to handle_actor_update and handle_object_update for improved clarity and consistency. Combined logic for handling object and interaction updates into handle_object_update. Updated corresponding test method names to match the new handler method names.
Changed the PHPUnit @Covers annotation from ::update_actor to ::handle_actor_update in Test_Update to accurately reflect the method being tested.
Changed the post type labels in the registration array from 'Posts'/'Post' to 'Objects'/'Object' to better reflect the content type.
Inserted a blank line after the use statements in class-update.php to improve code readability and adhere to coding standards.
Deleted the Blocks::convert_from_html method and its related test, and updated the Sanitize class to no longer convert HTML content to blocks. This simplifies content sanitization by removing automatic block conversion.
Refactored the Objects collection to Posts, updating class names, constants, and references in all relevant files. Adjusted post type from 'ap_object' to 'ap_post' and updated related taxonomy registrations, handlers, and tests to reflect the new naming convention. This improves clarity and consistency in the codebase.
Refactored the method name from create_object to create_post to better reflect its purpose of handling post creation. Updated all relevant references and docblocks for clarity.
Changed the @Covers annotation from ::create_object to ::create_post in the test_handle_create_object_with_sanitization method to accurately reflect the method being tested.
Adds registration of the '_activitypub_remote_actor_id' post meta for the custom post type, storing the local ID of the remote actor that created the object. This includes type enforcement, a description, and a sanitize callback.
The 'ap_object_type' taxonomy has been added to the registered taxonomies array for the relevant post type, allowing posts to be associated with both 'ap_tag' and 'ap_object_type'.
Deleted the 'labels' arrays from the registration of 'ap_tag' and 'ap_object_type' taxonomies. This will cause WordPress to use default labels for these taxonomies.
Eliminated the 'rewrite' arguments for 'ap_tag' and 'ap_object_type' taxonomies, reverting to default rewrite behavior. This may affect the URLs generated for these taxonomies.
Refactors the method name from handle_actor_update to update_actor for clarity and updates all references accordingly.
Renamed the test method from test_handle_actor_update to test_update_actor and updated the @Covers annotation and method calls to reference update_actor instead of handle_actor_update. This aligns the test with the updated function name.
Initializes $result as WP_Error by default and updates the docblock for the 'activitypub_handled_update' action to specify more precise types for the $result parameter. Removes unreachable assignment in the comment update branch.
Adds explicit WP_Error objects when Create or Update operations fail, ensuring consistent error reporting. Also updates the docblock for the Create handler to reflect possible result types.
Moved the default WP_Error assignment to the start of handle_object_update, ensuring $result is always initialized. Removed redundant error assignment after update attempts.
@obenland obenland requested a review from Copilot October 27, 2025 20:29
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

Copilot reviewed 9 out of 9 changed files in this pull request and generated 3 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Replace mime_content_type() with WordPress's wp_check_filetype_and_ext()
to properly validate file types and detect mime types using WordPress
standards. This provides better security by validating that file contents
match their extensions and prevents mime type spoofing.
Replaces the 'delete_attachments_with_post' method with 'delete_directory' for post deletion cleanup. Renames media and file append helper methods for clarity and consistency.
@obenland obenland requested a review from pfefferle October 28, 2025 13:27
@Jiwoon-Kim

This comment was marked as off-topic.

- Replace generic import_files() with type-specific import_post_files()
- Move orchestration logic to private import_files_for_object()
- Make $object_type a required parameter in all helper methods
- Add helper methods for content and storage path abstraction:
  - get_storage_paths() for directory path handling
  - get_object_content() for content retrieval
  - update_object_content() for content updates
- Add public $comments_dir property for future comment support
- Update call sites to use new import_post_files() method
- Add cache clearing after attachment import to ensure fresh post data

This refactoring makes it trivial to add comment attachment support in
the future by adding import_comment_files() that calls the same private
orchestration method.
@pfefferle
Copy link
Member

@Jiwoon-Kim this is not directly related to this PR, can you please start this as discussion instead, to keep the focus on the code changes.

@obenland
Copy link
Member Author

@pfefferle This should be ready for another review


/**
* Import attachments from an ActivityPub object and attach them to a post.
*
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would at least mention that this is used by the importer!?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you tell me more about that? We don't usually document callers, what makes this different?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I always stumble upon that function, because it still has an unprefixed/unsuffixed name import but it is very specific to the post import used by the mastodon import. I think it would be nice to give a hint about these two differences and when to use which function.


// Process attachments if present.
if ( ! empty( $activity_object['attachment'] ) ) {
Attachments::import_post_files( $activity_object['attachment'], $post_id );
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we need to provide an option to unhook local storage?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, that's a good question. Maybe it should be one that allows disabling the creation of ap_posts entirely?
I'd probably consider sideloading attachments part of the core functionality of the "save incoming posts" feature.

Copy link
Member

@pfefferle pfefferle Oct 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it makes no sense to disable the creation of ap_posts entirely, because this is literally not using the follow functionality. But I think it is a valid usecase for users with limited webspace, to not store the images locally but to stick with hotlinking.

(that is why I proposed hooking into save/update post hooks btw.)

@Jiwoon-Kim
Copy link
Contributor

Conditional Media Handling Architecture: Differentiating Upload Paths and Managing Non-Attached ActivityStreams Objects
#2382 (comment)

@obenland obenland merged commit e483fc8 into trunk Oct 29, 2025
12 of 14 checks passed
@obenland obenland deleted the attachment-handling branch October 29, 2025 13:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants